NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Grey-Box Machine Learning Prediction of Parallel Application Scaling

Alasandagutti, Akhil; Bridges, Patrick G; Estrada, Trilce (December 2025, Proceedings of the 32st IEEE International Conference on High Performance Computing, Data, and Analytics)

Accurate prediction of parallel application performance in HPC systems is essential for efficient resource allocation and system design. Classical performance models estimate of speedup based on theoretical assumptions, but their applicability is limited by parameter estimation, data acquisition, and real-world system issues such as latency and network congestion. This paper describes performance prediction using classical performance models boosted by a trainable machine learning framework. Domain-informed machine-learning models estimate the overhead of an application for a given problem size and resource configuration as a coefficient of the estimated speedup provided by performance laws. We evaluate this approach on two HPC mini-applications and two full applications with varying patterns of computation and communication and also evaluate the prediction accuracy on runs with varying processors-per-node configurations. Our results show that this method significantly improves the accuracy of performance predictions over standard analytical models and black-box regressors, while remaining robust even with limited training data.
more » « less
Full Text Available
Scalable, High-Fidelity Monitoring of Application Communication Patterns in Vernier

https://doi.org/10.1145/3731599.3767520

Dominguez-Trujillo, Jered; Schafer, Derek; Shipley, Riley; Marshall, Ryan; Bacon, Nicholas; Moraru, Maxim; Shipman, Galen; Skjellum, Anthony; Bridges, Patrick (November 2025, ACM)

Full Text Available
Understanding GPU Triggering APIs for MPI+X Communication

https://doi.org/10.1007/978-3-031-73370-3_3

Bridges, Patrick G; Skjellum, Anthony; Suggs, Evan D; Schafer, Derek; Bangalore, Purushotham V (September 2024, Springer Nature Switzerland)

Full Text Available
Scaling Laws for the Workload Throughput of Emerging Heterogeneous Clusters

https://doi.org/10.1109/CCGRID64434.2025.00025

Alasandagutti, Akhil; Suetterlein, Joshua; Firoz, Jesun; Young, Stephen; Manzano, Joseph; Stewart, Jason R; Bridges, Patrick G; Estrada, Trilce; Barker, Kevin (May 2025, IEEE)

Not AvailableNext-generation HPC clusters are evolving into highly heterogeneous systems that integrate traditional computing resources with emerging accelerator technologies such as quantum processors, neuromorphic units, dataflow architectures, and specialized AI accelerators within a unified infrastructure. These advanced systems enable workloads to dynamically utilize different accelerators during various computation phases, creating complex execution patterns. The performance of the workloads can therefore be impacted by many factors, including how the accelerators are shared, their utilization, and their placement within the system. Moreover, effects such as the system and network state due to the overall system load can significantly impact the job completion rate. Understanding, identifying, and quantifying the impact of the most critical factors (e.g., the number of allocated accelerators) will help decide the investment decisions for accelerator acquisition and deployment that can improve the overall system throughput. This paper extensively studies these complex interactions among advanced accelerators within an HPC cluster and various workloads. We introduce a novel analytical model which predicts the speedup of a workload given an accelerator/system configuration. This model can be used to quantify the effect of augmenting additional accelerators on job performance running on an HPC cluster. We validate the model using both simulated and real environments.
more » « less
Full Text Available
nidiamcl/stream-graph: Initial Public Release

https://doi.org/10.5281/zenodo.10631952

Vaquera, Nidia; Estrada, Trilce; Jafari Khouzani, Soheila; Bridges, Patrick G. (February 2024, Zenodo)

This is the initial public release of the NSF funded PASCAL-G algorithm, which includes the MPI implementation we developed.
more » « less
sjafari2/K8sKafkaPipeline: K8s-Kafka-DataStreaming-Pipeline

https://doi.org/10.5281/zenodo.10631950

Jafari Khouzani, Soheila; Vaquera, Nidia; Estrada, Trilce; Bridges, Patrick G. (January 2024, Zenodo)

This is the initial public release for a funded project by NFS which developes the Kafka Pipeline orchestrated in Kubernetes to run a data streamiong in a real-time fashion.
more » « less
Quantifying and Modeling Irregular MPI Communication

https://doi.org/10.1109/CCGrid59990.2024.00065

Woods, Carson; Schafer, Derek; Bridges, Patrick G; Skjellum, Anthony (May 2024, IEEE)

Full Text Available
Evaluating the Viability of LogGP for Modeling MPI Performance with Non-contiguous Datatypes on Modern Architectures

Bacon, Nicholas; Bridges, Patrick G.; Levy, Scott; Ferreira, Kurt; Bienz, Amanda (January 2023, EuroMPI'23: Proceedings of the 30th European MPI Users' Group Meeting)

Modern architectures and communication systems software include complex hardware, communication abstractions, and optimizations that make their performance difficult to measure, model, and understand. This paper examines the ability of modified versions of the existing Netgauge communication performance measurement tool and LogGOPS performance model to accurately characterize communication behavior of modern hardware, MPI abstractions, and implementations. This includes analyzing their ability to model both GPU-aware communication in different MPI implementations and quantifying the performance characteristics of different approaches to non-contiguous data communication on modern GPU systems. This paper also applies these techniques to quantify the performance of different implementations and optimization approaches to non-contiguous data communication on a variety of systems, demonstrating that modern communication system design approaches can result in widely-varying and difficult-to-predict performance variation, even within the same hardware/communication software combination.
more » « less
Full Text Available
Cryptocurrency Fraud and Code Sharing Data Set and Analysis Code

https://doi.org/10.5281/zenodo.7268194

Haskins, Keira; Vasek, Marie; Bridges, Patrick (January 2022, Zenodo)

This release covers the state of the data and associated analysis code for determining code sharing between cryptocurrency codebases funded through the end of the original NSF CRII award. This material is based on work supported by the National Science Foundation under Grant CNS-1849729.</p>
more » « less
SAMPRA: Scalable Analysis, Management, Protection of Research Artifacts

https://doi.org/10.1109/eScience51609.2021.00028

Bridges, Patrick G.; Akhavan, Zeinab; Wheeler, Jonathan; Al-Azzawi, Hussein; Albillar, Orlando; Faustino, Grace (September 2021, 2021 IEEE 17th International Conference on eScience (eScience))

Full Text Available

« Prev Next »

Search for: All records